Dataset statistics
| Number of variables | 22 |
|---|---|
| Number of observations | 389003 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 76.3 MiB |
| Average record size in memory | 205.7 B |
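The average record size follows from the two figures above it: total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming the report's "MiB" means 1024 × 1024 bytes (the function name is illustrative, not part of the report):

```python
MIB = 1024 * 1024  # bytes per mebibyte (assumed unit of "Total size in memory")


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Average number of bytes per row, given total memory in MiB."""
    return total_mib * MIB / n_rows


# Figures taken from the Dataset statistics table.
avg_bytes = average_record_size(76.3, 389003)
print(f"{avg_bytes:.1f} B")  # 205.7 B
```

Because 76.3 MiB is itself rounded, the result only matches the reported value to about one decimal place.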
Variable types
| DateTime | 2 |
|---|---|
| Numeric | 9 |
| Categorical | 11 |
| Alert | Type |
|---|---|
| merchant has a high cardinality: 693 distinct values | High cardinality |
| first has a high cardinality: 352 distinct values | High cardinality |
| last has a high cardinality: 481 distinct values | High cardinality |
| street has a high cardinality: 982 distinct values | High cardinality |
| city has a high cardinality: 894 distinct values | High cardinality |
| job has a high cardinality: 494 distinct values | High cardinality |
| trans_num has a high cardinality: 389003 distinct values | High cardinality |
| cc_num is highly overall correlated with gender and 1 other field | High correlation |
| zip is highly overall correlated with state and 4 other fields | High correlation |
| lat is highly overall correlated with state and 4 other fields | High correlation |
| long is highly overall correlated with state and 4 other fields | High correlation |
| merch_lat is highly overall correlated with state and 4 other fields | High correlation |
| merch_long is highly overall correlated with state and 4 other fields | High correlation |
| gender is highly overall correlated with cc_num and 1 other field | High correlation |
| state is highly overall correlated with zip and 5 other fields | High correlation |
| city_pop is highly overall correlated with state | High correlation |
| is_fraud is highly imbalanced (94.9%) | Imbalance |
| amt is highly skewed (γ₁ = 55.16028174) | Skewed |
| trans_num is uniformly distributed | Uniform |
| trans_num has unique values | Unique |
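The imbalance figure for is_fraud is not a class proportion. One common way to score imbalance, and to the best of my knowledge the approach ydata-profiling takes, is one minus the normalized Shannon entropy of the class counts. The sketch below rests on that assumption. The class counts are hypothetical, since the real is_fraud counts do not appear in this part of the report.

```python
import math


def imbalance_score(class_counts):
    """Return 1 - (Shannon entropy / log2(number of classes)).

    0.0 means perfectly balanced classes, 1.0 means a single class holds all rows.
    """
    total = sum(class_counts)
    n_classes = len(class_counts)
    if n_classes < 2 or total == 0:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1.0 - entropy / math.log2(n_classes)


# Hypothetical split: 994 legitimate rows and 6 fraudulent rows per 1000.
print(round(imbalance_score([994, 6]), 3))
```

Under this definition a minority share of well under 1% is enough to produce a score near 0.95, which is the order of magnitude the alert reports.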
Reproduction
| Analysis started | 2023-05-17 00:06:20.706651 |
|---|---|
| Analysis finished | 2023-05-17 00:08:22.813799 |
| Duration | 2 minutes and 2.11 seconds |
| Software version | ydata-profiling v4.1.2 |
| Download configuration | config.json |
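A report like this one can be regenerated with a few lines of Python. The sketch below assumes the data is available as a CSV file and that `pandas` and `ydata-profiling` are installed. The function name, the report title, and the default output path are illustrative, and the original `config.json` settings are not reproduced.

```python
def build_profile(csv_path: str, output_html: str = "profile_report.html") -> None:
    """Profile the CSV at `csv_path` and write an HTML report to `output_html`."""
    # Imported inside the function so this module loads even where the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(output_html)
```

Calling `build_profile("transactions.csv")` (a hypothetical file name) would write the report next to the script. Run time depends on the data size. The run recorded above took about two minutes for 389003 rows.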
| Distinct | 386997 |
|---|---|
| Distinct (%) | 99.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| Minimum | 2019-01-01 00:00:51 |
|---|---|
| Maximum | 2020-06-21 12:13:36 |
cc_num
Real number (ℝ)
| Distinct | 982 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.185546 × 10¹⁷ |
| Minimum | 6.0416207 × 10¹⁰ |
|---|---|
| Maximum | 4.9923464 × 10¹⁸ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 14.0 MiB |
Quantile statistics
| Minimum | 6.0416207 × 10¹⁰ |
|---|---|
| 5-th percentile | 6.3048488 × 10¹¹ |
| Q1 | 1.8004295 × 10¹⁴ |
| median | 3.5214173 × 10¹⁵ |
| Q3 | 4.6422555 × 10¹⁵ |
| 95-th percentile | 4.5025395 × 10¹⁸ |
| Maximum | 4.9923464 × 10¹⁸ |
| Range | 4.9923463 × 10¹⁸ |
| Interquartile range (IQR) | 4.4622125 × 10¹⁵ |
Descriptive statistics
| Standard deviation | 1.3111107 × 10¹⁸ |
|---|---|
| Coefficient of variation (CV) | 3.1324724 |
| Kurtosis | 6.1495469 |
| Mean | 4.185546 × 10¹⁷ |
| Median Absolute Deviation (MAD) | 3.0764709 × 10¹⁵ |
| Skewness | 2.8465774 |
| Sum | 8.0315113 × 10¹⁸ |
| Variance | 1.7190114 × 10³⁶ |
| Monotonicity | Not monotonic |
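cc_num is profiled as a real number, although its 982 distinct values behave like identifiers rather than quantities. If the column is held as 64-bit floats (an assumption, since the report does not state the dtype), identifiers above 2⁵³ cannot all be represented exactly. The sketch below uses hypothetical numbers in the same range as the reported minimum and maximum to show the effect, and why storing such a column as text is the safer choice.

```python
def survives_float_round_trip(identifier: int) -> bool:
    """Return True if converting to float and back yields the same integer."""
    return int(float(identifier)) == identifier


# Hypothetical identifiers, chosen only to match the magnitude of the
# reported extremes (about 6 x 10^10 and 5 x 10^18).
short_id = 60416207185
long_id = 4992346398000000001

print(survives_float_round_trip(short_id))  # True: below 2**53, exactly representable
print(survives_float_round_trip(long_id))   # False: low-order digits are rounded away

# Keeping the identifier as text preserves every digit.
print(int(str(long_id)) == long_id)         # True
```

The mean, variance, and sum reported for this column have no meaningful interpretation for an identifier, so they can reasonably be ignored.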
| Value | Count | Frequency (%) |
| 4.512828415 × 10¹⁸ | 1009 | 0.3% |
| 3.764452668 × 10¹⁴ | 976 | 0.3% |
| 6.304249875 × 10¹¹ | 968 | 0.2% |
| 3.575789282 × 10¹⁵ | 958 | 0.2% |
| 6.534628261 × 10¹⁵ | 956 | 0.2% |
| 2.720433096 × 10¹⁵ | 946 | 0.2% |
| 6.011438889 × 10¹⁵ | 945 | 0.2% |
| 4.792627764 × 10¹⁸ | 945 | 0.2% |
| 3.54510934 × 10¹⁵ | 944 | 0.2% |
| 4.716561797 × 10¹⁵ | 942 | 0.2% |
| Other values (972) | 379414 |
| Value | Count | Frequency (%) |
| 6.041620718 × 10¹⁰ | 436 | |
| 6.042292873 × 10¹⁰ | 439 | |
| 6.042309813 × 10¹⁰ | 162 | < 0.1% |
| 6.042785159 × 10¹⁰ | 155 | < 0.1% |
| 6.048700208 × 10¹⁰ | 151 | < 0.1% |
| 6.04905963 × 10¹⁰ | 294 | |
| 6.049559311 × 10¹⁰ | 152 | < 0.1% |
| 5.018029536 × 10¹¹ | 509 | |
| 5.018181333 × 10¹¹ | 2 | < 0.1% |
| 5.018282048 × 10¹¹ | 145 | < 0.1% |
| Value | Count | Frequency (%) |
| 4.992346398 × 10¹⁸ | 654 | |
| 4.989847571 × 10¹⁸ | 295 | 0.1% |
| 4.980323468 × 10¹⁸ | 151 | < 0.1% |
| 4.973530368 × 10¹⁸ | 309 | 0.1% |
| 4.958589672 × 10¹⁸ | 458 | |
| 4.95682899 × 10¹⁸ | 814 | |
| 4.911818931 × 10¹⁸ | 4 | < 0.1% |
| 4.906628656 × 10¹⁸ | 780 | |
| 4.897067971 × 10¹⁸ | 323 | 0.1% |
| 4.890424427 × 10¹⁸ | 459 |
merchant
Categorical
| Distinct | 693 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| fraud_Kilback LLC | 1268 |
|---|---|
| fraud_Cormier LLC | 1093 |
| fraud_Schumm PLC | 1081 |
| fraud_Dickinson Ltd | 1061 |
| fraud_Kuhn LLC | 1034 |
| Other values (688) |
Length
| Max length | 43 |
|---|---|
| Median length | 36 |
| Mean length | 23.133667 |
| Min length | 13 |
Characters and Unicode
| Total characters | 8999066 |
|---|---|
| Distinct characters | 55 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | fraud_Hills-Witting |
|---|---|
| 2nd row | fraud_Kling Inc |
| 3rd row | fraud_DuBuque LLC |
| 4th row | fraud_Grimes LLC |
| 5th row | fraud_Ullrich Ltd |
Common Values
| Value | Count | Frequency (%) |
| fraud_Kilback LLC | 1268 | 0.3% |
| fraud_Cormier LLC | 1093 | 0.3% |
| fraud_Schumm PLC | 1081 | 0.3% |
| fraud_Dickinson Ltd | 1061 | 0.3% |
| fraud_Kuhn LLC | 1034 | 0.3% |
| fraud_Boyer PLC | 999 | 0.3% |
| fraud_Prohaska-Murray | 838 | 0.2% |
| fraud_Olson, Becker and Koch | 834 | 0.2% |
| fraud_Stroman, Hudson and Erdman | 834 | 0.2% |
| fraud_Emard Inc | 831 | 0.2% |
| Other values (683) | 379130 |
Words
| Value | Count | Frequency (%) |
| and | 142354 | 15.7% |
| llc | 29295 | 3.2% |
| inc | 27595 | 3.0% |
| sons | 22046 | 2.4% |
| ltd | 21298 | 2.3% |
| plc | 19732 | 2.2% |
| group | 15283 | 1.7% |
| fraud_kutch | 3191 | 0.4% |
| fraud_schaefer | 2855 | 0.3% |
| fraud_streich | 2761 | 0.3% |
| Other values (804) | 620812 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 872715 | 9.7% |
| r | 808746 | 9.0% |
| d | 641831 | 7.1% |
| e | 559186 | 6.2% |
| u | 557157 | 6.2% |
| n | 530288 | 5.9% |
| (space) | 518219 | 5.8% |
| f | 419227 | 4.7% |
| _ | 389003 | 4.3% |
| o | 339376 | 3.8% |
| Other values (45) | 3363318 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6810376 | |
| Uppercase Letter | 1019016 | 11.3% |
| Space Separator | 518219 | 5.8% |
| Connector Punctuation | 389003 | 4.3% |
| Dash Punctuation | 133446 | 1.5% |
| Other Punctuation | 129006 | 1.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 872715 | |
| r | 808746 | |
| d | 641831 | |
| e | 559186 | 8.2% |
| u | 557157 | 8.2% |
| n | 530288 | 7.8% |
| f | 419227 | 6.2% |
| o | 339376 | 5.0% |
| i | 324292 | 4.8% |
| t | 262990 | 3.9% |
| Other values (15) | 1494568 |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 142762 | |
| C | 93391 | 9.2% |
| S | 90718 | 8.9% |
| B | 83325 | 8.2% |
| H | 78015 | 7.7% |
| K | 65247 | 6.4% |
| G | 57896 | 5.7% |
| R | 54548 | 5.4% |
| M | 53836 | 5.3% |
| P | 47587 | 4.7% |
| Other values (15) | 251691 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 120308 | |
| ' | 8698 | 6.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 518219 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 389003 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 133446 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7829392 | |
| Common | 1169674 | 13.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 872715 | 11.1% |
| r | 808746 | 10.3% |
| d | 641831 | 8.2% |
| e | 559186 | 7.1% |
| u | 557157 | 7.1% |
| n | 530288 | 6.8% |
| f | 419227 | 5.4% |
| o | 339376 | 4.3% |
| i | 324292 | 4.1% |
| t | 262990 | 3.4% |
| Other values (40) | 2513584 |
Common
| Value | Count | Frequency (%) |
| (space) | 518219 | |
| _ | 389003 | |
| - | 133446 | 11.4% |
| , | 120308 | 10.3% |
| ' | 8698 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8999066 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 872715 | 9.7% |
| r | 808746 | 9.0% |
| d | 641831 | 7.1% |
| e | 559186 | 6.2% |
| u | 557157 | 6.2% |
| n | 530288 | 5.9% |
| (space) | 518219 | 5.8% |
| f | 419227 | 4.7% |
| _ | 389003 | 4.3% |
| o | 339376 | 3.8% |
| Other values (45) | 3363318 |
category
Categorical
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| gas_transport | 39239 |
|---|---|
| grocery_pos | 37261 |
| home | 37029 |
| shopping_pos | 34789 |
| kids_pets | 33961 |
| Other values (9) | 206724 |
Length
| Max length | 14 |
|---|---|
| Median length | 12 |
| Mean length | 10.523646 |
| Min length | 4 |
Characters and Unicode
| Total characters | 4093730 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | shopping_net |
|---|---|
| 2nd row | gas_transport |
| 3rd row | grocery_pos |
| 4th row | entertainment |
| 5th row | kids_pets |
Common Values
| Value | Count | Frequency (%) |
| gas_transport | 39239 | |
| grocery_pos | 37261 | |
| home | 37029 | |
| shopping_pos | 34789 | |
| kids_pets | 33961 | |
| shopping_net | 29244 | |
| entertainment | 28341 | |
| food_dining | 27464 | |
| personal_care | 27394 | 7.0% |
| health_fitness | 25748 | 6.6% |
| Other values (4) | 68533 |
Words
| Value | Count | Frequency (%) |
| gas_transport | 39239 | |
| grocery_pos | 37261 | |
| home | 37029 | |
| shopping_pos | 34789 | |
| kids_pets | 33961 | |
| shopping_net | 29244 | |
| entertainment | 28341 | |
| food_dining | 27464 | |
| personal_care | 27394 | 7.0% |
| health_fitness | 25748 | 6.6% |
| Other values (4) | 68533 |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 427903 | |
| e | 387087 | |
| o | 369314 | |
| n | 358077 | |
| p | 324524 | 7.9% |
| t | 322921 | 7.9% |
| _ | 311382 | 7.6% |
| r | 275512 | 6.7% |
| i | 249727 | 6.1% |
| a | 199606 | 4.9% |
| Other values (10) | 867677 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3782348 | |
| Connector Punctuation | 311382 | 7.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| s | 427903 | |
| e | 387087 | |
| o | 369314 | |
| n | 358077 | |
| p | 324524 | |
| t | 322921 | |
| r | 275512 | |
| i | 249727 | 6.6% |
| a | 199606 | 5.3% |
| g | 181563 | 4.8% |
| Other values (9) | 686114 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 311382 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3782348 | |
| Common | 311382 | 7.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| s | 427903 | |
| e | 387087 | |
| o | 369314 | |
| n | 358077 | |
| p | 324524 | |
| t | 322921 | |
| r | 275512 | |
| i | 249727 | 6.6% |
| a | 199606 | 5.3% |
| g | 181563 | 4.8% |
| Other values (9) | 686114 |
Common
| Value | Count | Frequency (%) |
| _ | 311382 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4093730 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| s | 427903 | |
| e | 387087 | |
| o | 369314 | |
| n | 358077 | |
| p | 324524 | 7.9% |
| t | 322921 | 7.9% |
| _ | 311382 | 7.6% |
| r | 275512 | 6.7% |
| i | 249727 | 6.1% |
| a | 199606 | 4.9% |
| Other values (10) | 867677 |
amt
Real number (ℝ)
| Distinct | 33399 |
|---|---|
| Distinct (%) | 8.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 70.249035 |
| Minimum | 1 |
|---|---|
| Maximum | 28948.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 14.0 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2.45 |
| Q1 | 9.64 |
| median | 47.5 |
| Q3 | 83.06 |
| 95-th percentile | 195.689 |
| Maximum | 28948.9 |
| Range | 28947.9 |
| Interquartile range (IQR) | 73.42 |
Descriptive statistics
| Standard deviation | 169.32887 |
|---|---|
| Coefficient of variation (CV) | 2.4104086 |
| Kurtosis | 6777.7869 |
| Mean | 70.249035 |
| Median Absolute Deviation (MAD) | 37.5 |
| Skewness | 55.160282 |
| Sum | 27327085 |
| Variance | 28672.268 |
| Monotonicity | Not monotonic |
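Several of the amt statistics above are derived from the others, which makes them easy to cross-check. The sketch below recomputes the IQR, range, coefficient of variation, variance, and sum from the reported values. Small differences are expected because the inputs are already rounded, and the variable names are illustrative.

```python
# Values copied from the amt tables above.
n_rows = 389003
mean, std = 70.249035, 169.32887
q1, q3 = 9.64, 83.06
minimum, maximum = 1.0, 28948.9

iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum
cv = std / mean                 # coefficient of variation
variance = std ** 2
total = mean * n_rows           # should approximate the reported Sum

print(f"IQR={iqr:.2f}  range={value_range:.1f}  CV={cv:.4f}")
print(f"variance={variance:.2f}  sum={total:.0f}")
```

A CV above 2 together with a median (47.5) well below the mean (70.25) is consistent with the strong right skew flagged in the alerts.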
| Value | Count | Frequency (%) |
| 1.1 | 180 | < 0.1% |
| 1.14 | 179 | < 0.1% |
| 1.31 | 167 | < 0.1% |
| 1.12 | 163 | < 0.1% |
| 1.03 | 161 | < 0.1% |
| 1.08 | 161 | < 0.1% |
| 1.25 | 158 | < 0.1% |
| 1.4 | 158 | < 0.1% |
| 1.17 | 157 | < 0.1% |
| 1.16 | 156 | < 0.1% |
| Other values (33389) | 387363 |
| Value | Count | Frequency (%) |
| 1 | 59 | < 0.1% |
| 1.01 | 154 | |
| 1.02 | 142 | |
| 1.03 | 161 | |
| 1.04 | 140 | |
| 1.05 | 155 | |
| 1.06 | 134 | |
| 1.07 | 145 | |
| 1.08 | 161 | |
| 1.09 | 138 |
| Value | Count | Frequency (%) |
| 28948.9 | 1 | |
| 27119.77 | 1 | |
| 26544.12 | 1 | |
| 17897.24 | 1 | |
| 15305.95 | 1 | |
| 14849.74 | 1 | |
| 14238.11 | 1 | |
| 12176.55 | 1 | |
| 12025.3 | 1 | |
| 11872.21 | 1 |
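The largest amt values sit two orders of magnitude above the 95th percentile, which is what drives the skewness reported for this column. A log transform is one common way to reduce that kind of skew before modelling. The sketch below illustrates the effect on a toy sample built from the quantiles reported above, not on the real column. It uses the plain population skewness formula, which may differ slightly from the bias-corrected estimator behind the report's figure.

```python
import math


def skewness(values):
    """Population skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Toy sample: reported min, 5th percentile, Q1, median, Q3, 95th percentile, max.
toy_amounts = [1.0, 2.45, 9.64, 47.5, 83.06, 195.689, 28948.9]
log_amounts = [math.log1p(v) for v in toy_amounts]

print(f"raw skewness:   {skewness(toy_amounts):.2f}")
print(f"log1p skewness: {skewness(log_amounts):.2f}")
```

On this toy sample the transformed values are markedly less skewed. Whether the same holds for the full column would need to be checked on the data itself.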
first
Categorical
| Distinct | 352 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| Christopher | 7987 |
|---|---|
| Robert | 6604 |
| James | 6151 |
| Jessica | 6108 |
| Michael | 6013 |
| Other values (347) |
Length
| Max length | 11 |
|---|---|
| Median length | 9 |
| Mean length | 6.0823515 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2366053 |
|---|---|
| Distinct characters | 49 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Amber |
|---|---|
| 2nd row | Robert |
| 3rd row | Kathryn |
| 4th row | Phillip |
| 5th row | Vanessa |
Common Values
| Value | Count | Frequency (%) |
| Christopher | 7987 | 2.1% |
| Robert | 6604 | 1.7% |
| James | 6151 | 1.6% |
| Jessica | 6108 | 1.6% |
| Michael | 6013 | 1.5% |
| David | 5965 | 1.5% |
| Jennifer | 5168 | 1.3% |
| William | 4986 | 1.3% |
| John | 4965 | 1.3% |
| Mary | 4825 | 1.2% |
| Other values (342) | 330231 |
Words
| Value | Count | Frequency (%) |
| christopher | 7987 | 2.1% |
| robert | 6604 | 1.7% |
| james | 6151 | 1.6% |
| jessica | 6108 | 1.6% |
| michael | 6013 | 1.5% |
| david | 5965 | 1.5% |
| jennifer | 5168 | 1.3% |
| william | 4986 | 1.3% |
| john | 4965 | 1.3% |
| mary | 4825 | 1.2% |
| Other values (342) | 330231 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 301673 | 12.8% |
| e | 258513 | 10.9% |
| i | 185377 | 7.8% |
| n | 183962 | 7.8% |
| r | 182008 | 7.7% |
| l | 116715 | 4.9% |
| h | 103744 | 4.4% |
| s | 97785 | 4.1% |
| t | 93652 | 4.0% |
| o | 80985 | 3.4% |
| Other values (39) | 761639 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1977050 | |
| Uppercase Letter | 389003 | 16.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 301673 | |
| e | 258513 | |
| i | 185377 | |
| n | 183962 | |
| r | 182008 | |
| l | 116715 | 5.9% |
| h | 103744 | 5.2% |
| s | 97785 | 4.9% |
| t | 93652 | 4.7% |
| o | 80985 | 4.1% |
| Other values (16) | 372636 |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 65790 | |
| M | 43564 | |
| S | 34243 | |
| A | 33507 | |
| C | 31999 | |
| K | 25703 | 6.6% |
| D | 25634 | 6.6% |
| R | 21292 | 5.5% |
| T | 19823 | 5.1% |
| L | 18831 | 4.8% |
| Other values (13) | 68617 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2366053 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 301673 | 12.8% |
| e | 258513 | 10.9% |
| i | 185377 | 7.8% |
| n | 183962 | 7.8% |
| r | 182008 | 7.7% |
| l | 116715 | 4.9% |
| h | 103744 | 4.4% |
| s | 97785 | 4.1% |
| t | 93652 | 4.0% |
| o | 80985 | 3.4% |
| Other values (39) | 761639 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2366053 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 301673 | 12.8% |
| e | 258513 | 10.9% |
| i | 185377 | 7.8% |
| n | 183962 | 7.8% |
| r | 182008 | 7.7% |
| l | 116715 | 4.9% |
| h | 103744 | 4.4% |
| s | 97785 | 4.1% |
| t | 93652 | 4.0% |
| o | 80985 | 3.4% |
| Other values (39) | 761639 |
last
Categorical
| Distinct | 481 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| Smith | 8711 |
|---|---|
| Williams | 6981 |
| Davis | 6497 |
| Johnson | 5844 |
| Rodriguez | 5208 |
| Other values (476) |
Length
| Max length | 11 |
|---|---|
| Median length | 10 |
| Mean length | 6.107228 |
| Min length | 2 |
Characters and Unicode
| Total characters | 2375730 |
|---|---|
| Distinct characters | 48 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Lewis |
|---|---|
| 2nd row | James |
| 3rd row | Smith |
| 4th row | Robertson |
| 5th row | Anderson |
Common Values
| Value | Count | Frequency (%) |
| Smith | 8711 | 2.2% |
| Williams | 6981 | 1.8% |
| Davis | 6497 | 1.7% |
| Johnson | 5844 | 1.5% |
| Rodriguez | 5208 | 1.3% |
| Martinez | 4505 | 1.2% |
| Jones | 4151 | 1.1% |
| Lewis | 3749 | 1.0% |
| Gonzalez | 3549 | 0.9% |
| Miller | 3513 | 0.9% |
| Other values (471) | 336295 |
Words
| Value | Count | Frequency (%) |
| smith | 8711 | 2.2% |
| williams | 6981 | 1.8% |
| davis | 6497 | 1.7% |
| johnson | 5844 | 1.5% |
| rodriguez | 5208 | 1.3% |
| martinez | 4505 | 1.2% |
| jones | 4151 | 1.1% |
| lewis | 3749 | 1.0% |
| gonzalez | 3549 | 0.9% |
| miller | 3513 | 0.9% |
| Other values (471) | 336295 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 236267 | 9.9% |
| r | 197779 | 8.3% |
| a | 194214 | 8.2% |
| n | 182474 | 7.7% |
| o | 174868 | 7.4% |
| l | 146587 | 6.2% |
| s | 145591 | 6.1% |
| i | 130221 | 5.5% |
| t | 86481 | 3.6% |
| h | 68251 | 2.9% |
| Other values (38) | 812997 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1986727 | |
| Uppercase Letter | 389003 | 16.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 236267 | |
| r | 197779 | |
| a | 194214 | |
| n | 182474 | |
| o | 174868 | |
| l | 146587 | 7.4% |
| s | 145591 | 7.3% |
| i | 130221 | 6.6% |
| t | 86481 | 4.4% |
| h | 68251 | 3.4% |
| Other values (15) | 423994 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 47471 | |
| W | 32079 | 8.2% |
| S | 31511 | 8.1% |
| C | 27961 | 7.2% |
| B | 25297 | 6.5% |
| R | 25089 | 6.4% |
| H | 24370 | 6.3% |
| G | 22694 | 5.8% |
| J | 21160 | 5.4% |
| P | 19706 | 5.1% |
| Other values (13) | 111665 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2375730 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 236267 | 9.9% |
| r | 197779 | 8.3% |
| a | 194214 | 8.2% |
| n | 182474 | 7.7% |
| o | 174868 | 7.4% |
| l | 146587 | 6.2% |
| s | 145591 | 6.1% |
| i | 130221 | 5.5% |
| t | 86481 | 3.6% |
| h | 68251 | 2.9% |
| Other values (38) | 812997 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2375730 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 236267 | 9.9% |
| r | 197779 | 8.3% |
| a | 194214 | 8.2% |
| n | 182474 | 7.7% |
| o | 174868 | 7.4% |
| l | 146587 | 6.2% |
| s | 145591 | 6.1% |
| i | 130221 | 5.5% |
| t | 86481 | 3.6% |
| h | 68251 | 2.9% |
| Other values (38) | 812997 |
gender
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| F | 212679 |
|---|---|
| M | 176324 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 389003 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | F |
|---|---|
| 2nd row | M |
| 3rd row | F |
| 4th row | M |
| 5th row | F |
Common Values
| Value | Count | Frequency (%) |
| F | 212679 | |
| M | 176324 |
Words
| Value | Count | Frequency (%) |
| f | 212679 | |
| m | 176324 |
Most occurring characters
| Value | Count | Frequency (%) |
| F | 212679 | |
| M | 176324 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 389003 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 212679 | |
| M | 176324 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 389003 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| F | 212679 | |
| M | 176324 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 389003 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| F | 212679 | |
| M | 176324 |
street
Categorical
| Distinct | 982 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| 864 Reynolds Plains | 1009 |
|---|---|
| 372 Jeffrey Course | 976 |
| 2870 Bean Terrace Apt. 756 | 968 |
| 7618 Gonzales Mission | 958 |
| 29606 Martinez Views Suite 653 | 956 |
| Other values (977) |
Length
| Max length | 35 |
|---|---|
| Median length | 29 |
| Mean length | 22.225983 |
| Min length | 12 |
Characters and Unicode
| Total characters | 8645974 |
|---|---|
| Distinct characters | 62 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 11 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 6296 John Keys Suite 858 |
|---|---|
| 2nd row | 18316 Cannon Place |
| 3rd row | 19838 Tonya Prairie Apt. 947 |
| 4th row | 85344 Smith Gateway Apt. 280 |
| 5th row | 21178 Brittney Locks |
Common Values
| Value | Count | Frequency (%) |
| 864 Reynolds Plains | 1009 | 0.3% |
| 372 Jeffrey Course | 976 | 0.3% |
| 2870 Bean Terrace Apt. 756 | 968 | 0.2% |
| 7618 Gonzales Mission | 958 | 0.2% |
| 29606 Martinez Views Suite 653 | 956 | 0.2% |
| 854 Walker Dale Suite 488 | 946 | 0.2% |
| 40624 Rebecca Spurs | 945 | 0.2% |
| 7952 Karen Pike | 945 | 0.2% |
| 8030 Beck Motorway | 944 | 0.2% |
| 11014 Chad Lake Apt. 573 | 942 | 0.2% |
| Other values (972) | 379414 |
Words
| Value | Count | Frequency (%) |
| apt | 98165 | 6.3% |
| suite | 91491 | 5.9% |
| island | 6981 | 0.5% |
| michael | 5625 | 0.4% |
| station | 5376 | 0.3% |
| common | 5349 | 0.3% |
| islands | 5319 | 0.3% |
| david | 5215 | 0.3% |
| brooks | 5105 | 0.3% |
| fields | 4966 | 0.3% |
| Other values (1938) | 1312729 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1157318 | 13.4% |
| e | 537671 | 6.2% |
| a | 436559 | 5.0% |
| i | 389799 | 4.5% |
| t | 374730 | 4.3% |
| r | 331150 | 3.8% |
| n | 320223 | 3.7% |
| s | 310614 | 3.6% |
| l | 266766 | 3.1% |
| o | 262407 | 3.0% |
| Other values (52) | 4258737 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4325186 | |
| Decimal Number | 2097643 | |
| Space Separator | 1157318 | 13.4% |
| Uppercase Letter | 967662 | 11.2% |
| Other Punctuation | 98165 | 1.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 537671 | |
| a | 436559 | |
| i | 389799 | |
| t | 374730 | |
| r | 331150 | 7.7% |
| n | 320223 | 7.4% |
| s | 310614 | 7.2% |
| l | 266766 | 6.2% |
| o | 262407 | 6.1% |
| u | 184498 | 4.3% |
| Other values (16) | 910769 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 168431 | |
| A | 126251 | |
| M | 77421 | 8.0% |
| C | 67442 | 7.0% |
| P | 59081 | 6.1% |
| R | 55929 | 5.8% |
| B | 44661 | 4.6% |
| F | 42866 | 4.4% |
| L | 39577 | 4.1% |
| J | 36221 | 3.7% |
| Other values (14) | 249782 |
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 224471 | |
| 3 | 221250 | |
| 2 | 220401 | |
| 7 | 211589 | |
| 1 | 208271 | |
| 8 | 207692 | |
| 6 | 203782 | |
| 0 | 202913 | |
| 4 | 200145 | |
| 9 | 197129 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1157318 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 98165 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5292848 | |
| Common | 3353126 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 537671 | 10.2% |
| a | 436559 | 8.2% |
| i | 389799 | 7.4% |
| t | 374730 | 7.1% |
| r | 331150 | 6.3% |
| n | 320223 | 6.1% |
| s | 310614 | 5.9% |
| l | 266766 | 5.0% |
| o | 262407 | 5.0% |
| u | 184498 | 3.5% |
| Other values (40) | 1878431 |
Common
| Value | Count | Frequency (%) |
| (space) | 1157318 | |
| 5 | 224471 | 6.7% |
| 3 | 221250 | 6.6% |
| 2 | 220401 | 6.6% |
| 7 | 211589 | 6.3% |
| 1 | 208271 | 6.2% |
| 8 | 207692 | 6.2% |
| 6 | 203782 | 6.1% |
| 0 | 202913 | 6.1% |
| 4 | 200145 | 6.0% |
| Other values (2) | 295294 | 8.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8645974 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 1157318 | 13.4% |
| e | 537671 | 6.2% |
| a | 436559 | 5.0% |
| i | 389799 | 4.5% |
| t | 374730 | 4.3% |
| r | 331150 | 3.8% |
| n | 320223 | 3.7% |
| s | 310614 | 3.6% |
| l | 266766 | 3.1% |
| o | 262407 | 3.0% |
| Other values (52) | 4258737 |
city
Categorical
| Distinct | 894 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| Birmingham | 1649 |
|---|---|
| Phoenix | 1539 |
| Meridian | 1519 |
| Utica | 1502 |
| San Antonio | 1464 |
| Other values (889) |
Length
| Max length | 25 |
|---|---|
| Median length | 21 |
| Mean length | 8.6494706 |
| Min length | 3 |
Characters and Unicode
| Total characters | 3364670 |
|---|---|
| Distinct characters | 52 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 10 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Pembroke Township |
|---|---|
| 2nd row | Newport |
| 3rd row | Rocky Mount |
| 4th row | Harrodsburg |
| 5th row | Prosperity |
Common Values
| Value | Count | Frequency (%) |
| Birmingham | 1649 | 0.4% |
| Phoenix | 1539 | 0.4% |
| Meridian | 1519 | 0.4% |
| Utica | 1502 | 0.4% |
| San Antonio | 1464 | 0.4% |
| Warren | 1419 | 0.4% |
| Thomas | 1393 | 0.4% |
| Conway | 1355 | 0.3% |
| Cleveland | 1338 | 0.3% |
| Burbank | 1293 | 0.3% |
| Other values (884) | 374532 |
Words
| Value | Count | Frequency (%) |
| city | 6387 | 1.3% |
| west | 5921 | 1.2% |
| north | 4349 | 0.9% |
| saint | 4344 | 0.9% |
| falls | 3886 | 0.8% |
| new | 3627 | 0.7% |
| lake | 3392 | 0.7% |
| mount | 3375 | 0.7% |
| san | 2989 | 0.6% |
| springs | 2604 | 0.5% |
| Other values (918) | 444563 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 327388 | 9.7% |
| a | 280684 | 8.3% |
| n | 246472 | 7.3% |
| o | 245079 | 7.3% |
| l | 233863 | 7.0% |
| r | 224644 | 6.7% |
| i | 210699 | 6.3% |
| t | 180033 | 5.4% |
| s | 134255 | 4.0% |
| (space) | 96434 | 2.9% |
| Other values (42) | 1185119 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2782211 | |
| Uppercase Letter | 485731 | 14.4% |
| Space Separator | 96434 | 2.9% |
| Dash Punctuation | 294 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 327388 | |
| a | 280684 | |
| n | 246472 | |
| o | 245079 | |
| l | 233863 | 8.4% |
| r | 224644 | 8.1% |
| i | 210699 | 7.6% |
| t | 180033 | 6.5% |
| s | 134255 | 4.8% |
| d | 92355 | 3.3% |
| Other values (15) | 606739 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 47053 | 9.7% |
| M | 43980 | 9.1% |
| S | 40999 | 8.4% |
| B | 39792 | 8.2% |
| H | 34909 | 7.2% |
| W | 28700 | 5.9% |
| P | 27613 | 5.7% |
| L | 25983 | 5.3% |
| R | 23599 | 4.9% |
| A | 22440 | 4.6% |
| Other values (15) | 150663 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 96434 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 294 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3267942 | |
| Common | 96728 | 2.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 327388 | 10.0% |
| a | 280684 | 8.6% |
| n | 246472 | 7.5% |
| o | 245079 | 7.5% |
| l | 233863 | 7.2% |
| r | 224644 | 6.9% |
| i | 210699 | 6.4% |
| t | 180033 | 5.5% |
| s | 134255 | 4.1% |
| d | 92355 | 2.8% |
| Other values (40) | 1092470 |
Common
| Value | Count | Frequency (%) |
| (space) | 96434 | |
| - | 294 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3364670 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 327388 | 9.7% |
| a | 280684 | 8.3% |
| n | 246472 | 7.3% |
| o | 245079 | 7.3% |
| l | 233863 | 7.0% |
| r | 224644 | 6.7% |
| i | 210699 | 6.3% |
| t | 180033 | 5.4% |
| s | 134255 | 4.0% |
| (space) | 96434 | 2.9% |
| Other values (42) | 1185119 |
state
Categorical
| Distinct | 50 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| TX | 28360 |
|---|---|
| NY | 25049 |
| PA | 23990 |
| CA | 16970 |
| OH | 13915 |
| Other values (45) | 280719 |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 778006 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | IL |
|---|---|
| 2nd row | ME |
| 3rd row | MO |
| 4th row | IN |
| 5th row | SC |
Common Values
| Value | Count | Frequency (%) |
| TX | 28360 | 7.3% |
| NY | 25049 | 6.4% |
| PA | 23990 | 6.2% |
| CA | 16970 | 4.4% |
| OH | 13915 | 3.6% |
| MI | 13860 | 3.6% |
| IL | 12972 | 3.3% |
| FL | 12878 | 3.3% |
| AL | 12223 | 3.1% |
| MO | 11346 | 2.9% |
| Other values (40) | 217440 |
Words
| Value | Count | Frequency (%) |
| tx | 28360 | 7.3% |
| ny | 25049 | 6.4% |
| pa | 23990 | 6.2% |
| ca | 16970 | 4.4% |
| oh | 13915 | 3.6% |
| mi | 13860 | 3.6% |
| il | 12972 | 3.3% |
| fl | 12878 | 3.3% |
| al | 12223 | 3.1% |
| mo | 11346 | 2.9% |
| Other values (40) | 217440 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 106791 | |
| N | 85557 | 11.0% |
| M | 66284 | 8.5% |
| I | 54610 | 7.0% |
| T | 46005 | 5.9% |
| L | 44397 | 5.7% |
| O | 42889 | 5.5% |
| C | 42359 | 5.4% |
| Y | 39311 | 5.1% |
| X | 28360 | 3.6% |
| Other values (14) | 221443 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 778006 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 106791 | |
| N | 85557 | 11.0% |
| M | 66284 | 8.5% |
| I | 54610 | 7.0% |
| T | 46005 | 5.9% |
| L | 44397 | 5.7% |
| O | 42889 | 5.5% |
| C | 42359 | 5.4% |
| Y | 39311 | 5.1% |
| X | 28360 | 3.6% |
| Other values (14) | 221443 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 778006 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 106791 | |
| N | 85557 | 11.0% |
| M | 66284 | 8.5% |
| I | 54610 | 7.0% |
| T | 46005 | 5.9% |
| L | 44397 | 5.7% |
| O | 42889 | 5.5% |
| C | 42359 | 5.4% |
| Y | 39311 | 5.1% |
| X | 28360 | 3.6% |
| Other values (14) | 221443 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 778006 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 106791 | |
| N | 85557 | 11.0% |
| M | 66284 | 8.5% |
| I | 54610 | 7.0% |
| T | 46005 | 5.9% |
| L | 44397 | 5.7% |
| O | 42889 | 5.5% |
| C | 42359 | 5.4% |
| Y | 39311 | 5.1% |
| X | 28360 | 3.6% |
| Other values (14) | 221443 |
zip
Real number (ℝ)
| Distinct | 969 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 48763.684 |
| Minimum | 1257 |
|---|---|
| Maximum | 99783 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 14.0 MiB |
Quantile statistics
| Minimum | 1257 |
|---|---|
| 5-th percentile | 7208 |
| Q1 | 26237 |
| median | 48174 |
| Q3 | 72011 |
| 95-th percentile | 94569 |
| Maximum | 99783 |
| Range | 98526 |
| Interquartile range (IQR) | 45774 |
Descriptive statistics
| Standard deviation | 26892.358 |
|---|---|
| Coefficient of variation (CV) | 0.55148331 |
| Kurtosis | -1.0945937 |
| Mean | 48763.684 |
| Median Absolute Deviation (MAD) | 23058 |
| Skewness | 0.079981952 |
| Sum | 1.8969219 × 10^10 |
| Variance | 7.231989 × 10^8 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 48088 | 1116 | 0.3% |
| 73754 | 1111 | 0.3% |
| 34112 | 1069 | 0.3% |
| 82514 | 1027 | 0.3% |
| 15484 | 1009 | 0.3% |
| 69165 | 976 | 0.3% |
| 26292 | 968 | 0.2% |
| 64019 | 958 | 0.2% |
| 5461 | 956 | 0.2% |
| 4287 | 946 | 0.2% |
| Other values (959) | 378867 | 97.4% |
| Value | Count | Frequency (%) |
| 1257 | 587 | 0.2% |
| 1330 | 305 | 0.1% |
| 1535 | 161 | < 0.1% |
| 1545 | 295 | 0.1% |
| 1612 | 179 | < 0.1% |
| 1843 | 771 | 0.2% |
| 1844 | 644 | 0.2% |
| 2180 | 168 | < 0.1% |
| 2630 | 669 | 0.2% |
| 2908 | 159 | < 0.1% |
| Value | Count | Frequency (%) |
| 99783 | 469 | 0.1% |
| 99747 | 7 | < 0.1% |
| 99746 | 171 | < 0.1% |
| 99323 | 785 | 0.2% |
| 99160 | 920 | 0.2% |
| 99116 | 3 | < 0.1% |
| 99113 | 331 | 0.1% |
| 99033 | 731 | 0.2% |
| 98836 | 162 | < 0.1% |
| 98665 | 150 | < 0.1% |
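Because `zip` is profiled as a real number, ZIP codes that begin with zero lose that digit: the minimum above is 1257, and the sample rows at the end of the report list 4953 for ME and 8014 for NJ. The following is a minimal sketch for restoring the usual five-character form, assuming the values are US five-digit ZIP codes stored as integers; `format_zip` is a hypothetical helper name and is not part of the dataset or of ydata-profiling.

```python
def format_zip(value: int) -> str:
    """Render an integer ZIP code as a zero-padded five-character string."""
    if not 0 <= value <= 99999:
        raise ValueError(f"not a five-digit ZIP code: {value}")
    return f"{value:05d}"


# Values taken from the zip tables and sample rows in this report.
for raw in (1257, 4953, 8014, 99783):
    print(raw, "->", format_zip(raw))
```

Stored this way the column is an identifier rather than a quantity, so summary rows such as the mean or variance would no longer be computed for it.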
lat
Real number (ℝ)
| Distinct | 967 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.54639 |
| Minimum | 20.0271 |
|---|---|
| Maximum | 66.6933 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 14.0 MiB |
Quantile statistics
| Minimum | 20.0271 |
|---|---|
| 5-th percentile | 29.8826 |
| Q1 | 34.6205 |
| median | 39.3716 |
| Q3 | 41.9404 |
| 95-th percentile | 45.8433 |
| Maximum | 66.6933 |
| Range | 46.6662 |
| Interquartile range (IQR) | 7.3199 |
Descriptive statistics
| Standard deviation | 5.0849144 |
|---|---|
| Coefficient of variation (CV) | 0.13191675 |
| Kurtosis | 0.82936072 |
| Mean | 38.54639 |
| Median Absolute Deviation (MAD) | 3.3564 |
| Skewness | -0.18657805 |
| Sum | 14994661 |
| Variance | 25.856355 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 42.5164 | 1116 | 0.3% |
| 36.385 | 1111 | 0.3% |
| 26.1184 | 1069 | 0.3% |
| 43.0048 | 1027 | 0.3% |
| 39.8936 | 1009 | 0.3% |
| 41.1558 | 976 | 0.3% |
| 39.1505 | 968 | 0.2% |
| 38.7897 | 958 | 0.2% |
| 44.3346 | 956 | 0.2% |
| 44.0575 | 946 | 0.2% |
| Other values (957) | 378867 | 97.4% |
| Value | Count | Frequency (%) |
| 20.0271 | 476 | 0.1% |
| 20.0827 | 309 | 0.1% |
| 24.6557 | 780 | 0.2% |
| 26.1184 | 1069 | 0.3% |
| 26.3304 | 177 | < 0.1% |
| 26.3771 | 159 | < 0.1% |
| 26.4215 | 907 | 0.2% |
| 26.4722 | 760 | 0.2% |
| 26.529 | 457 | 0.1% |
| 26.6939 | 313 | 0.1% |
| Value | Count | Frequency (%) |
| 66.6933 | 7 | < 0.1% |
| 65.6899 | 171 | < 0.1% |
| 64.7556 | 469 | 0.1% |
| 48.8878 | 920 | 0.2% |
| 48.8856 | 621 | 0.2% |
| 48.8328 | 479 | 0.1% |
| 48.6669 | 339 | 0.1% |
| 48.6031 | 897 | 0.2% |
| 48.4786 | 586 | 0.2% |
| 48.34 | 907 | 0.2% |
long
Real number (ℝ)
| Distinct | 968 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -90.216346 |
| Minimum | -165.6723 |
|---|---|
| Maximum | -67.9503 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 389003 |
| Negative (%) | 100.0% |
| Memory size | 14.0 MiB |
Quantile statistics
| Minimum | -165.6723 |
|---|---|
| 5-th percentile | -119.0825 |
| Q1 | -96.798 |
| median | -87.4769 |
| Q3 | -80.158 |
| 95-th percentile | -73.5112 |
| Maximum | -67.9503 |
| Range | 97.722 |
| Interquartile range (IQR) | 16.64 |
Descriptive statistics
| Standard deviation | 13.767459 |
|---|---|
| Coefficient of variation (CV) | -0.15260493 |
| Kurtosis | 1.8805399 |
| Mean | -90.216346 |
| Median Absolute Deviation (MAD) | 8.1527 |
| Skewness | -1.1544279 |
| Sum | -35094429 |
| Variance | 189.54292 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -82.9832 | 1116 | 0.3% |
| -98.0727 | 1111 | 0.3% |
| -81.7361 | 1069 | 0.3% |
| -108.8964 | 1027 | 0.3% |
| -79.7856 | 1009 | 0.3% |
| -101.136 | 976 | 0.3% |
| -79.503 | 968 | 0.2% |
| -93.8702 | 958 | 0.2% |
| -73.098 | 956 | 0.2% |
| -69.9656 | 946 | 0.2% |
| Other values (958) | 378867 | 97.4% |
| Value | Count | Frequency (%) |
| -165.6723 | 469 | 0.1% |
| -156.292 | 171 | < 0.1% |
| -155.488 | 309 | 0.1% |
| -155.3697 | 476 | 0.1% |
| -153.994 | 7 | < 0.1% |
| -124.4409 | 314 | 0.1% |
| -124.2174 | 487 | 0.1% |
| -124.1587 | 329 | 0.1% |
| -124.1437 | 456 | 0.1% |
| -123.9743 | 626 | 0.2% |
| Value | Count | Frequency (%) |
| -67.9503 | 608 | 0.2% |
| -68.5565 | 293 | 0.1% |
| -69.2675 | 155 | < 0.1% |
| -69.4828 | 647 | 0.2% |
| -69.9576 | 167 | < 0.1% |
| -69.9656 | 946 | 0.2% |
| -70.1031 | 3 | < 0.1% |
| -70.239 | 333 | 0.1% |
| -70.3001 | 669 | 0.2% |
| -70.3457 | 435 | 0.1% |
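The derived rows in the `long` statistics follow from the other rows: the range is the maximum minus the minimum, the IQR is Q3 minus Q1, and the coefficient of variation is the standard deviation divided by the mean, which is why it is negative for a column whose mean is negative. A short check using the figures reported above (the variable names are illustrative only):

```python
# Figures copied from the `long` tables above.
minimum, maximum = -165.6723, -67.9503
q1, q3 = -96.798, -80.158
mean, std = -90.216346, 13.767459

value_range = maximum - minimum  # reported as 97.722
iqr = q3 - q1                    # reported as 16.64
cv = std / mean                  # reported as -0.15260493

print(round(value_range, 4), round(iqr, 4), round(cv, 6))
```

The same relationships hold for the other numeric variables in the report.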
city_pop
Real number (ℝ)
| Distinct | 878 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 88362.409 |
| Minimum | 23 |
|---|---|
| Maximum | 2906700 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 14.0 MiB |
Quantile statistics
| Minimum | 23 |
|---|---|
| 5-th percentile | 139 |
| Q1 | 743 |
| median | 2456 |
| Q3 | 20328 |
| 95-th percentile | 518429 |
| Maximum | 2906700 |
| Range | 2906677 |
| Interquartile range (IQR) | 19585 |
Descriptive statistics
| Standard deviation | 300908.65 |
|---|---|
| Coefficient of variation (CV) | 3.4053921 |
| Kurtosis | 38.040459 |
| Mean | 88362.409 |
| Median Absolute Deviation (MAD) | 2198 |
| Skewness | 5.6235064 |
| Sum | 3.4373242 × 10^10 |
| Variance | 9.0546015 × 10^10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 606 | 1598 | 0.4% |
| 1312922 | 1539 | 0.4% |
| 1595797 | 1464 | 0.4% |
| 1766 | 1389 | 0.4% |
| 241 | 1337 | 0.3% |
| 302 | 1249 | 0.3% |
| 2906700 | 1248 | 0.3% |
| 276002 | 1246 | 0.3% |
| 198 | 1208 | 0.3% |
| 910148 | 1205 | 0.3% |
| Other values (868) | 375520 | 96.5% |
| Value | Count | Frequency (%) |
| 23 | 680 | 0.2% |
| 37 | 287 | 0.1% |
| 43 | 599 | 0.2% |
| 46 | 868 | 0.2% |
| 47 | 150 | < 0.1% |
| 49 | 321 | 0.1% |
| 51 | 318 | 0.1% |
| 52 | 169 | < 0.1% |
| 53 | 778 | 0.2% |
| 60 | 298 | 0.1% |
| Value | Count | Frequency (%) |
| 2906700 | 1248 | 0.3% |
| 2504700 | 616 | 0.2% |
| 2383912 | 149 | < 0.1% |
| 1595797 | 1464 | 0.4% |
| 1577385 | 800 | 0.2% |
| 1526206 | 1019 | 0.3% |
| 1417793 | 2 | < 0.1% |
| 1382480 | 616 | 0.2% |
| 1312922 | 1539 | 0.4% |
| 1263321 | 1111 | 0.3% |
job
Categorical
| Distinct | 494 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| Film/video editor | 2939 |
|---|---|
| Exhibition designer | 2691 |
| Naval architect | 2619 |
| Surveyor, land/geomatics | 2595 |
| Designer, ceramics/pottery | 2531 |
| Other values (489) | 375628 |
Length
| Max length | 59 |
|---|---|
| Median length | 38 |
| Mean length | 20.221502 |
| Min length | 3 |
Characters and Unicode
| Total characters | 7866225 |
|---|---|
| Distinct characters | 53 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Psychotherapist, child |
|---|---|
| 2nd row | Lexicographer |
| 3rd row | Tax inspector |
| 4th row | Social researcher |
| 5th row | Archaeologist |
Common Values
| Value | Count | Frequency (%) |
| Film/video editor | 2939 | 0.8% |
| Exhibition designer | 2691 | 0.7% |
| Naval architect | 2619 | 0.7% |
| Surveyor, land/geomatics | 2595 | 0.7% |
| Designer, ceramics/pottery | 2531 | 0.7% |
| Materials engineer | 2529 | 0.7% |
| Systems developer | 2325 | 0.6% |
| IT trainer | 2320 | 0.6% |
| Environmental consultant | 2241 | 0.6% |
| Financial adviser | 2229 | 0.6% |
| Other values (484) | 363984 | 93.6% |
Length (plot not shown)
Words
| Value | Count | Frequency (%) |
| engineer | 39829 | 4.6% |
| officer | 33394 | 3.9% |
| manager | 18282 | 2.1% |
| scientist | 16644 | 1.9% |
| designer | 15634 | 1.8% |
| surveyor | 14722 | 1.7% |
| teacher | 11391 | 1.3% |
| psychologist | 9874 | 1.1% |
| research | 8893 | 1.0% |
| editor | 8681 | 1.0% |
| Other values (456) | 685976 | 79.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 841424 | 10.7% |
| i | 715448 | 9.1% |
| r | 660585 | 8.4% |
| a | 543559 | 6.9% |
| t | 533350 | 6.8% |
| n | 529137 | 6.7% |
| (space) | 474317 | 6.0% |
| o | 447733 | 5.7% |
| s | 432444 | 5.5% |
| c | 396711 | 5.0% |
| Other values (43) | 2291517 | 29.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6833249 | 86.9% |
| Space Separator | 474317 | 6.0% |
| Uppercase Letter | 410738 | 5.2% |
| Other Punctuation | 133629 | 1.7% |
| Close Punctuation | 7146 | 0.1% |
| Open Punctuation | 7146 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 841424 | 12.3% |
| i | 715448 | 10.5% |
| r | 660585 | 9.7% |
| a | 543559 | 8.0% |
| t | 533350 | 7.8% |
| n | 529137 | 7.7% |
| o | 447733 | 6.6% |
| s | 432444 | 6.3% |
| c | 396711 | 5.8% |
| l | 299601 | 4.4% |
| Other values (16) | 1433257 | 21.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 46982 | 11.4% |
| E | 43615 | 10.6% |
| P | 42886 | 10.4% |
| S | 41177 | 10.0% |
| T | 34071 | 8.3% |
| M | 26957 | 6.6% |
| A | 26283 | 6.4% |
| F | 20533 | 5.0% |
| D | 17527 | 4.3% |
| R | 16803 | 4.1% |
| Other values (11) | 93904 | 22.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 93894 | 70.3% |
| / | 37331 | 27.9% |
| ' | 2404 | 1.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 474317 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 7146 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 7146 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7243987 | 92.1% |
| Common | 622238 | 7.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 841424 | 11.6% |
| i | 715448 | 9.9% |
| r | 660585 | 9.1% |
| a | 543559 | 7.5% |
| t | 533350 | 7.4% |
| n | 529137 | 7.3% |
| o | 447733 | 6.2% |
| s | 432444 | 6.0% |
| c | 396711 | 5.5% |
| l | 299601 | 4.1% |
| Other values (37) | 1843995 | 25.5% |
Common
| Value | Count | Frequency (%) |
| (space) | 474317 | 76.2% |
| , | 93894 | 15.1% |
| / | 37331 | 6.0% |
| ) | 7146 | 1.1% |
| ( | 7146 | 1.1% |
| ' | 2404 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7866225 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 841424 | 10.7% |
| i | 715448 | 9.1% |
| r | 660585 | 8.4% |
| a | 543559 | 6.9% |
| t | 533350 | 6.8% |
| n | 529137 | 6.7% |
| (space) | 474317 | 6.0% |
| o | 447733 | 5.7% |
| s | 432444 | 5.5% |
| c | 396711 | 5.0% |
| Other values (43) | 2291517 | 29.1% |
dob
Date
| Distinct | 967 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| Minimum | 1924-10-30 00:00:00 |
|---|---|
| Maximum | 2005-01-29 00:00:00 |
trans_num
Categorical
HIGH CARDINALITY  UNIFORM  UNIQUE 
| Distinct | 389003 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| d0e8265dc7e7b979c1533abebe95402c | 1 |
|---|---|
| dacdb6a22d977b027a8fb074a0447076 | 1 |
| 0d906bb5ceb81dcd3296e81b389341d5 | 1 |
| ba345b8475217cf2e0230e1fbfa9e4ce | 1 |
| c12970404cdc90e85349479ec5fa3909 | 1 |
| Other values (388998) | 388998 |
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 32 |
| Min length | 32 |
Characters and Unicode
| Total characters | 12448096 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 389003 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | d0e8265dc7e7b979c1533abebe95402c |
|---|---|
| 2nd row | 71a947bf4d90e76e4d5b9a5f1b1ec8b4 |
| 3rd row | dfda32052a68f1452b1190b182672cb0 |
| 4th row | 8858dbc699716a100343e8402c3d2d17 |
| 5th row | b73f49b5c7081b864f1ba77678d86fce |
Common Values
| Value | Count | Frequency (%) |
| d0e8265dc7e7b979c1533abebe95402c | 1 | < 0.1% |
| dacdb6a22d977b027a8fb074a0447076 | 1 | < 0.1% |
| 0d906bb5ceb81dcd3296e81b389341d5 | 1 | < 0.1% |
| ba345b8475217cf2e0230e1fbfa9e4ce | 1 | < 0.1% |
| c12970404cdc90e85349479ec5fa3909 | 1 | < 0.1% |
| 0f8fef195889cbce74e53a0cf362e3a3 | 1 | < 0.1% |
| 6124557b02755ab42d8cd6aec1ab18b6 | 1 | < 0.1% |
| 39ba98c66e9ca69b98eef0b0fbc95abb | 1 | < 0.1% |
| 8c50ebb14119b5d29d4db8ed3e300b28 | 1 | < 0.1% |
| 5eb79aa6a1ac2ca00d5f6c6f9241e5b6 | 1 | < 0.1% |
| Other values (388993) | 388993 | > 99.9% |
Length (plot not shown)
Words
| Value | Count | Frequency (%) |
| d0e8265dc7e7b979c1533abebe95402c | 1 | < 0.1% |
| e77d938381d3c533c24e2159a6af263a | 1 | < 0.1% |
| f0dccb373dbb52f87993b587cdac5338 | 1 | < 0.1% |
| 4f7096fc7703ec9af052a743a2f6e2f4 | 1 | < 0.1% |
| 570a1cba31bb2b0f2421740a8b599f6b | 1 | < 0.1% |
| 09e9b830ca58e5c71a23748a3d05f869 | 1 | < 0.1% |
| 12cfa522315fd07b4fcf44dd752203f4 | 1 | < 0.1% |
| 871b92a678ed78cef4ec01e64a60bcd8 | 1 | < 0.1% |
| 5aa5735c69932b6c476b9c8910490e2c | 1 | < 0.1% |
| 75e691168933aafe9e0157ded10c971d | 1 | < 0.1% |
| Other values (388993) | 388993 | > 99.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 9 | 779268 | 6.3% |
| 7 | 778763 | 6.3% |
| c | 778747 | 6.3% |
| 3 | 778533 | 6.3% |
| 4 | 778318 | 6.3% |
| d | 778169 | 6.3% |
| 1 | 778128 | 6.3% |
| 5 | 778046 | 6.3% |
| 8 | 778030 | 6.3% |
| a | 778030 | 6.3% |
| Other values (6) | 4664064 | 37.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 7781134 | 62.5% |
| Lowercase Letter | 4666962 | 37.5% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 779268 | 10.0% |
| 7 | 778763 | 10.0% |
| 3 | 778533 | 10.0% |
| 4 | 778318 | 10.0% |
| 1 | 778128 | 10.0% |
| 5 | 778046 | 10.0% |
| 8 | 778030 | 10.0% |
| 6 | 777811 | 10.0% |
| 0 | 777165 | 10.0% |
| 2 | 777072 | 10.0% |
Lowercase Letter
| Value | Count | Frequency (%) |
| c | 778747 | 16.7% |
| d | 778169 | 16.7% |
| a | 778030 | 16.7% |
| e | 778017 | 16.7% |
| f | 777705 | 16.7% |
| b | 776294 | 16.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 7781134 | 62.5% |
| Latin | 4666962 | 37.5% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 9 | 779268 | 10.0% |
| 7 | 778763 | 10.0% |
| 3 | 778533 | 10.0% |
| 4 | 778318 | 10.0% |
| 1 | 778128 | 10.0% |
| 5 | 778046 | 10.0% |
| 8 | 778030 | 10.0% |
| 6 | 777811 | 10.0% |
| 0 | 777165 | 10.0% |
| 2 | 777072 | 10.0% |
Latin
| Value | Count | Frequency (%) |
| c | 778747 | 16.7% |
| d | 778169 | 16.7% |
| a | 778030 | 16.7% |
| e | 778017 | 16.7% |
| f | 777705 | 16.7% |
| b | 776294 | 16.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12448096 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 9 | 779268 | 6.3% |
| 7 | 778763 | 6.3% |
| c | 778747 | 6.3% |
| 3 | 778533 | 6.3% |
| 4 | 778318 | 6.3% |
| d | 778169 | 6.3% |
| 1 | 778128 | 6.3% |
| 5 | 778046 | 6.3% |
| 8 | 778030 | 6.3% |
| a | 778030 | 6.3% |
| Other values (6) | 4664064 | 37.5% |
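The character statistics above are what one would expect if every `trans_num` is a 32-character lowercase hexadecimal string. That is an assumption drawn from the sample values and the length table, not something the report states. Under it, the total is 389003 × 32 characters and each of the 16 symbols should account for roughly 1/16 of them, which agrees with the near-identical shares in the table. A small sketch (`looks_like_trans_num` is a hypothetical helper):

```python
import re

HEX32 = re.compile(r"[0-9a-f]{32}")


def looks_like_trans_num(value: str) -> bool:
    """Return True if value is exactly 32 lowercase hexadecimal characters."""
    return HEX32.fullmatch(value) is not None


rows, length, alphabet = 389003, 32, 16
total_chars = rows * length    # compare with "Total characters" above
expected_share = 1 / alphabet  # share per symbol if characters are uniform

print(total_chars, expected_share)
print(looks_like_trans_num("d0e8265dc7e7b979c1533abebe95402c"))
```

A column like this carries no predictive information by itself; it is typically kept only as a row key.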
unix_time
Real number (ℝ)
| Distinct | 387001 |
|---|---|
| Distinct (%) | 99.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.3492679 × 10^9 |
| Minimum | 1.3253761 × 10^9 |
|---|---|
| Maximum | 1.3718168 × 10^9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 14.0 MiB |
Quantile statistics
| Minimum | 1.3253761 × 10^9 |
|---|---|
| 5-th percentile | 1.3286996 × 10^9 |
| Q1 | 1.3387632 × 10^9 |
| median | 1.3492668 × 10^9 |
| Q3 | 1.3594863 × 10^9 |
| 95-th percentile | 1.3698414 × 10^9 |
| Maximum | 1.3718168 × 10^9 |
| Range | 46440765 |
| Interquartile range (IQR) | 20723125 |
Descriptive statistics
| Standard deviation | 12844910 |
|---|---|
| Coefficient of variation (CV) | 0.0095199102 |
| Kurtosis | -1.0883381 |
| Mean | 1.3492679 × 10^9 |
| Median Absolute Deviation (MAD) | 10386412 |
| Skewness | 0.0022487089 |
| Sum | 5.2486927 × 10^14 |
| Variance | 1.649917 × 10^14 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1342848987 | 3 | < 0.1% |
| 1348375471 | 3 | < 0.1% |
| 1339879107 | 3 | < 0.1% |
| 1347630495 | 3 | < 0.1% |
| 1344278185 | 3 | < 0.1% |
| 1344083264 | 3 | < 0.1% |
| 1331300462 | 2 | < 0.1% |
| 1349529879 | 2 | < 0.1% |
| 1337965573 | 2 | < 0.1% |
| 1333834607 | 2 | < 0.1% |
| Other values (386991) | 388977 | > 99.9% |
| Value | Count | Frequency (%) |
| 1325376051 | 1 | < 0.1% |
| 1325376282 | 1 | < 0.1% |
| 1325376308 | 1 | < 0.1% |
| 1325376383 | 1 | < 0.1% |
| 1325376416 | 1 | < 0.1% |
| 1325376543 | 1 | < 0.1% |
| 1325376788 | 1 | < 0.1% |
| 1325376877 | 1 | < 0.1% |
| 1325377060 | 1 | < 0.1% |
| 1325377356 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1371816816 | 1 | < 0.1% |
| 1371816683 | 1 | < 0.1% |
| 1371816512 | 1 | < 0.1% |
| 1371816488 | 1 | < 0.1% |
| 1371816474 | 1 | < 0.1% |
| 1371816383 | 1 | < 0.1% |
| 1371816372 | 1 | < 0.1% |
| 1371816179 | 1 | < 0.1% |
| 1371815931 | 1 | < 0.1% |
| 1371815849 | 1 | < 0.1% |
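To read these values as dates, they can be interpreted as seconds since the Unix epoch. The sketch below assumes the column holds UTC epoch seconds; the report does not state a time zone, and `to_utc` is an illustrative name. One thing worth checking with it: the first sample row at the end of the report pairs `unix_time` 1343233235 with a `trans_date_trans_time` of 2019-07-25 16:20:35, while the conversion yields the same clock time in 2012, so for that row the two columns differ by seven years.

```python
from datetime import datetime, timezone


def to_utc(epoch_seconds: int) -> datetime:
    """Interpret an integer as seconds since 1970-01-01 00:00:00 UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


print(to_utc(1325376051))  # minimum reported above
print(to_utc(1343233235))  # unix_time of the first sample row
```

Whether the offset holds for every row cannot be confirmed from the report alone; it would need to be checked against the full data before either column is used for time-based features.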
merch_lat
Real number (ℝ)
| Distinct | 384428 |
|---|---|
| Distinct (%) | 98.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.546535 |
| Minimum | 19.027785 |
|---|---|
| Maximum | 67.510267 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 14.0 MiB |
Quantile statistics
| Minimum | 19.027785 |
|---|---|
| 5-th percentile | 29.73821 |
| Q1 | 34.733977 |
| median | 39.383177 |
| Q3 | 41.967087 |
| 95-th percentile | 46.032372 |
| Maximum | 67.510267 |
| Range | 48.482482 |
| Interquartile range (IQR) | 7.23311 |
Descriptive statistics
| Standard deviation | 5.1194179 |
|---|---|
| Coefficient of variation (CV) | 0.13281136 |
| Kurtosis | 0.80996415 |
| Mean | 38.546535 |
| Median Absolute Deviation (MAD) | 3.393313 |
| Skewness | -0.18271084 |
| Sum | 14994718 |
| Variance | 26.20844 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 39.674306 | 3 | < 0.1% |
| 43.335386 | 3 | < 0.1% |
| 35.465297 | 3 | < 0.1% |
| 38.745859 | 3 | < 0.1% |
| 41.270604 | 3 | < 0.1% |
| 40.822878 | 3 | < 0.1% |
| 39.467059 | 3 | < 0.1% |
| 41.271468 | 3 | < 0.1% |
| 39.866419 | 3 | < 0.1% |
| 40.534977 | 3 | < 0.1% |
| Other values (384418) | 388973 | > 99.9% |
| Value | Count | Frequency (%) |
| 19.027785 | 1 | < 0.1% |
| 19.033288 | 1 | < 0.1% |
| 19.034282 | 1 | < 0.1% |
| 19.034687 | 1 | < 0.1% |
| 19.036312 | 1 | < 0.1% |
| 19.03922 | 1 | < 0.1% |
| 19.04188 | 1 | < 0.1% |
| 19.048001 | 1 | < 0.1% |
| 19.052896 | 1 | < 0.1% |
| 19.054697 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 67.510267 | 1 | < 0.1% |
| 67.397018 | 1 | < 0.1% |
| 67.064277 | 1 | < 0.1% |
| 66.835174 | 1 | < 0.1% |
| 66.65822 | 1 | < 0.1% |
| 66.653465 | 1 | < 0.1% |
| 66.645176 | 1 | < 0.1% |
| 66.624674 | 1 | < 0.1% |
| 66.609969 | 1 | < 0.1% |
| 66.598747 | 1 | < 0.1% |
merch_long
Real number (ℝ)
| Distinct | 387072 |
|---|---|
| Distinct (%) | 99.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -90.21776 |
| Minimum | -166.67013 |
|---|---|
| Maximum | -66.955996 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 389003 |
| Negative (%) | 100.0% |
| Memory size | 14.0 MiB |
Quantile statistics
| Minimum | -166.67013 |
|---|---|
| 5-th percentile | -119.30529 |
| Q1 | -96.896077 |
| median | -87.443539 |
| Q3 | -80.216844 |
| 95-th percentile | -73.337194 |
| Maximum | -66.955996 |
| Range | 99.714136 |
| Interquartile range (IQR) | 16.679233 |
Descriptive statistics
| Standard deviation | 13.779253 |
|---|---|
| Coefficient of variation (CV) | -0.15273326 |
| Kurtosis | 1.8717293 |
| Mean | -90.21776 |
| Median Absolute Deviation (MAD) | 8.243111 |
| Skewness | -1.1508036 |
| Sum | -35094979 |
| Variance | 189.86781 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -95.815432 | 3 | < 0.1% |
| -92.175166 | 3 | < 0.1% |
| -72.721662 | 3 | < 0.1% |
| -82.00945 | 3 | < 0.1% |
| -81.440297 | 3 | < 0.1% |
| -83.06618 | 3 | < 0.1% |
| -89.564406 | 3 | < 0.1% |
| -87.247803 | 2 | < 0.1% |
| -90.10778 | 2 | < 0.1% |
| -97.653063 | 2 | < 0.1% |
| Other values (387062) | 388976 | > 99.9% |
| Value | Count | Frequency (%) |
| -166.670132 | 1 | < 0.1% |
| -166.669638 | 1 | < 0.1% |
| -166.659277 | 1 | < 0.1% |
| -166.657174 | 1 | < 0.1% |
| -166.656219 | 1 | < 0.1% |
| -166.649771 | 1 | < 0.1% |
| -166.64352 | 1 | < 0.1% |
| -166.642151 | 1 | < 0.1% |
| -166.63998 | 1 | < 0.1% |
| -166.629875 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| -66.955996 | 1 | < 0.1% |
| -66.958751 | 1 | < 0.1% |
| -66.961923 | 1 | < 0.1% |
| -66.963975 | 1 | < 0.1% |
| -66.967742 | 1 | < 0.1% |
| -66.970769 | 1 | < 0.1% |
| -66.979887 | 1 | < 0.1% |
| -66.983261 | 1 | < 0.1% |
| -66.983329 | 1 | < 0.1% |
| -66.984433 | 1 | < 0.1% |
is_fraud
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 14.0 MiB |
| 0 | 386751 |
|---|---|
| 1 | 2252 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 389003 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 386751 | 99.4% |
| 1 | 2252 | 0.6% |
Length (plot not shown)
Common Values (plot not shown)
Words
| Value | Count | Frequency (%) |
| 0 | 386751 | 99.4% |
| 1 | 2252 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 386751 | 99.4% |
| 1 | 2252 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 389003 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 386751 | 99.4% |
| 1 | 2252 | 0.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 389003 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 386751 | 99.4% |
| 1 | 2252 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 389003 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 386751 | 99.4% |
| 1 | 2252 | 0.6% |
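The class counts above give a fraud rate of 2252 / 389003, about 0.58%. The 94.9% imbalance figure in the report's alerts is consistent with one minus the normalized Shannon entropy of the class distribution. That this is the formula ydata-profiling applies is an assumption here, checked only against the reported number; `imbalance_score` is an illustrative name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the
    class proportions and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is scored as 0 in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


counts = [386751, 2252]  # is_fraud = 0 and is_fraud = 1, from the table above
fraud_rate = counts[1] / sum(counts)
print(f"fraud rate: {fraud_rate:.2%}")
print(f"imbalance score: {imbalance_score(counts):.1%}")
```

A perfectly balanced target scores 0 under this definition and a target with almost all rows in one class approaches 1, so a value near 0.95 signals that accuracy alone would be a misleading metric for a fraud classifier trained on this data.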
Auto
| | cc_num | amt | zip | lat | long | city_pop | unix_time | merch_lat | merch_long | category | gender | state | is_fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| cc_num | 1.000 | -0.002 | 0.017 | -0.007 | -0.017 | 0.050 | 0.003 | -0.007 | -0.016 | 0.065 | 0.999 | 0.999 | 0.315 |
| amt | -0.002 | 1.000 | 0.001 | 0.013 | 0.000 | -0.025 | 0.001 | 0.013 | 0.000 | 0.021 | 0.000 | 0.000 | 0.000 |
| zip | 0.017 | 0.001 | 1.000 | -0.162 | -0.959 | -0.042 | 0.001 | -0.161 | -0.957 | 0.065 | 0.990 | 0.999 | 0.311 |
| lat | -0.007 | 0.013 | -0.162 | 1.000 | 0.105 | -0.266 | 0.003 | 0.991 | 0.103 | 0.011 | 0.101 | 0.799 | 0.011 |
| long | -0.017 | 0.000 | -0.959 | 0.105 | 1.000 | 0.089 | -0.003 | 0.105 | 0.998 | 0.008 | 0.090 | 0.922 | 0.005 |
| city_pop | 0.050 | -0.025 | -0.042 | -0.266 | 0.089 | 1.000 | -0.003 | -0.265 | 0.088 | 0.013 | 0.090 | 0.314 | 0.006 |
| unix_time | 0.003 | 0.001 | 0.001 | 0.003 | -0.003 | -0.003 | 1.000 | 0.003 | -0.003 | 0.000 | 0.000 | 0.003 | 0.017 |
| merch_lat | -0.007 | 0.013 | -0.161 | 0.991 | 0.105 | -0.265 | 0.003 | 1.000 | 0.103 | 0.011 | 0.103 | 0.813 | 0.010 |
| merch_long | -0.016 | 0.000 | -0.957 | 0.103 | 0.998 | 0.088 | -0.003 | 0.103 | 1.000 | 0.009 | 0.081 | 0.885 | 0.005 |
| category | 0.065 | 0.021 | 0.065 | 0.011 | 0.008 | 0.013 | 0.000 | 0.011 | 0.009 | 1.000 | 0.052 | 0.019 | 0.069 |
| gender | 0.999 | 0.000 | 0.990 | 0.101 | 0.090 | 0.090 | 0.000 | 0.103 | 0.081 | 0.052 | 1.000 | 0.257 | 0.005 |
| state | 0.999 | 0.000 | 0.999 | 0.799 | 0.922 | 0.314 | 0.003 | 0.813 | 0.885 | 0.019 | 0.257 | 1.000 | 0.014 |
| is_fraud | 0.315 | 0.000 | 0.311 | 0.011 | 0.005 | 0.006 | 0.017 | 0.010 | 0.005 | 0.069 | 0.005 | 0.014 | 1.000 |
Pearson's r
| | amt | lat | long | city_pop | unix_time | merch_lat | merch_long | is_fraud |
|---|---|---|---|---|---|---|---|---|
| amt | 1.000 | -0.001 | -0.000 | 0.005 | 0.001 | -0.001 | -0.000 | 0.207 |
| lat | -0.001 | 1.000 | -0.016 | -0.155 | 0.003 | 0.994 | -0.016 | 0.002 |
| long | -0.000 | -0.016 | 1.000 | -0.052 | -0.003 | -0.015 | 0.999 | 0.003 |
| city_pop | 0.005 | -0.155 | -0.052 | 1.000 | -0.001 | -0.154 | -0.052 | 0.002 |
| unix_time | 0.001 | 0.003 | -0.003 | -0.001 | 1.000 | 0.003 | -0.003 | -0.005 |
| merch_lat | -0.001 | 0.994 | -0.015 | -0.154 | 0.003 | 1.000 | -0.015 | 0.002 |
| merch_long | -0.000 | -0.016 | 0.999 | -0.052 | -0.003 | -0.015 | 1.000 | 0.003 |
| is_fraud | 0.207 | 0.002 | 0.003 | 0.002 | -0.005 | 0.002 | 0.003 | 1.000 |
Spearman's ρ
| | amt | lat | long | city_pop | unix_time | merch_lat | merch_long | is_fraud |
|---|---|---|---|---|---|---|---|---|
| amt | 1.000 | 0.013 | 0.000 | -0.025 | 0.001 | 0.013 | 0.000 | 0.087 |
| lat | 0.013 | 1.000 | 0.105 | -0.266 | 0.003 | 0.991 | 0.103 | 0.002 |
| long | 0.000 | 0.105 | 1.000 | 0.089 | -0.003 | 0.105 | 0.998 | 0.004 |
| city_pop | -0.025 | -0.266 | 0.089 | 1.000 | -0.003 | -0.265 | 0.088 | 0.003 |
| unix_time | 0.001 | 0.003 | -0.003 | -0.003 | 1.000 | 0.003 | -0.003 | -0.004 |
| merch_lat | 0.013 | 0.991 | 0.105 | -0.265 | 0.003 | 1.000 | 0.103 | 0.001 |
| merch_long | 0.000 | 0.103 | 0.998 | 0.088 | -0.003 | 0.103 | 1.000 | 0.004 |
| is_fraud | 0.087 | 0.002 | 0.004 | 0.003 | -0.004 | 0.001 | 0.004 | 1.000 |
Kendall's τ
| | amt | lat | long | city_pop | unix_time | merch_lat | merch_long | is_fraud |
|---|---|---|---|---|---|---|---|---|
| amt | 1.000 | 0.009 | 0.000 | -0.017 | 0.001 | 0.009 | 0.000 | 0.071 |
| lat | 0.009 | 1.000 | 0.085 | -0.178 | 0.002 | 0.920 | 0.083 | 0.001 |
| long | 0.000 | 0.085 | 1.000 | 0.062 | -0.002 | 0.084 | 0.966 | 0.004 |
| city_pop | -0.017 | -0.178 | 0.062 | 1.000 | -0.002 | -0.177 | 0.061 | 0.002 |
| unix_time | 0.001 | 0.002 | -0.002 | -0.002 | 1.000 | 0.002 | -0.002 | -0.003 |
| merch_lat | 0.009 | 0.920 | 0.084 | -0.177 | 0.002 | 1.000 | 0.082 | 0.001 |
| merch_long | 0.000 | 0.083 | 0.966 | 0.061 | -0.002 | 0.082 | 1.000 | 0.003 |
| is_fraud | 0.071 | 0.001 | 0.004 | 0.002 | -0.003 | 0.001 | 0.003 | 1.000 |
Phik (φk)
| | cc_num | category | amt | gender | state | zip | lat | long | city_pop | unix_time | merch_lat | merch_long | is_fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| cc_num | 1.000 | 0.016 | 0.000 | 0.031 | 0.439 | 0.145 | 0.257 | 0.190 | 0.082 | 0.000 | 0.188 | 0.175 | 0.002 |
| category | 0.016 | 1.000 | 0.047 | 0.067 | 0.067 | 0.027 | 0.024 | 0.019 | 0.030 | 0.000 | 0.024 | 0.019 | 0.088 |
| amt | 0.000 | 0.047 | 1.000 | 0.000 | 0.000 | 0.000 | 0.007 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| gender | 0.031 | 0.067 | 0.000 | 1.000 | 0.323 | 0.159 | 0.135 | 0.109 | 0.120 | 0.000 | 0.137 | 0.108 | 0.007 |
| state | 0.439 | 0.067 | 0.000 | 0.323 | 1.000 | 1.000 | 0.965 | 0.990 | 0.657 | 0.010 | 0.969 | 0.985 | 0.018 |
| zip | 0.145 | 0.027 | 0.000 | 0.159 | 1.000 | 1.000 | 0.679 | 0.853 | 0.271 | 0.003 | 0.674 | 0.845 | 0.001 |
| lat | 0.257 | 0.024 | 0.007 | 0.135 | 0.965 | 0.679 | 1.000 | 0.884 | 0.296 | 0.007 | 0.992 | 0.865 | 0.015 |
| long | 0.190 | 0.019 | 0.000 | 0.109 | 0.990 | 0.853 | 0.884 | 1.000 | 0.317 | 0.002 | 0.912 | 0.996 | 0.008 |
| city_pop | 0.082 | 0.030 | 0.000 | 0.120 | 0.657 | 0.271 | 0.296 | 0.317 | 1.000 | 0.007 | 0.270 | 0.332 | 0.008 |
| unix_time | 0.000 | 0.000 | 0.000 | 0.000 | 0.010 | 0.003 | 0.007 | 0.002 | 0.007 | 1.000 | 0.007 | 0.004 | 0.022 |
| merch_lat | 0.188 | 0.024 | 0.000 | 0.137 | 0.969 | 0.674 | 0.992 | 0.912 | 0.270 | 0.007 | 1.000 | 0.895 | 0.014 |
| merch_long | 0.175 | 0.019 | 0.000 | 0.108 | 0.985 | 0.845 | 0.865 | 0.996 | 0.332 | 0.004 | 0.895 | 1.000 | 0.006 |
| is_fraud | 0.002 | 0.088 | 0.000 | 0.007 | 0.018 | 0.001 | 0.015 | 0.008 | 0.008 | 0.022 | 0.014 | 0.006 | 1.000 |
Cramér's V (φc)
| | state | is_fraud | gender | category |
|---|---|---|---|---|
| state | 1.000 | 0.014 | 0.257 | 0.019 |
| is_fraud | 0.014 | 1.000 | 0.005 | 0.069 |
| gender | 0.257 | 0.005 | 1.000 | 0.052 |
| category | 0.019 | 0.069 | 0.052 | 1.000 |
| | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 468195 | 2019-07-25 16:20:35 | 4587657402165341815 | fraud_Hills-Witting | shopping_net | 265.89 | Amber | Lewis | F | 6296 John Keys Suite 858 | Pembroke Township | IL | 60958 | 41.0646 | -87.5917 | 2135 | Psychotherapist, child | 2004-05-08 | d0e8265dc7e7b979c1533abebe95402c | 1343233235 | 40.991185 | -88.538586 | 0 |
| 694039 | 2019-10-23 04:47:17 | 347612609554823 | fraud_Kling Inc | gas_transport | 68.21 | Robert | James | M | 18316 Cannon Place | Newport | ME | 4953 | 44.8393 | -69.2675 | 3228 | Lexicographer | 1995-12-28 | 71a947bf4d90e76e4d5b9a5f1b1ec8b4 | 1350967637 | 44.680256 | -69.390510 | 0 |
| 1092424 | 2020-03-30 09:50:38 | 6011652924285713 | fraud_DuBuque LLC | grocery_pos | 95.39 | Kathryn | Smith | F | 19838 Tonya Prairie Apt. 947 | Rocky Mount | MO | 65072 | 38.2911 | -92.7059 | 1847 | Tax inspector | 1988-10-26 | dfda32052a68f1452b1190b182672cb0 | 1364637038 | 38.954895 | -91.927764 | 0 |
| 47679 | 2019-01-28 22:32:55 | 4839615922685395 | fraud_Grimes LLC | entertainment | 21.39 | Phillip | Robertson | M | 85344 Smith Gateway Apt. 280 | Harrodsburg | IN | 47434 | 39.0130 | -86.5457 | 76 | Social researcher | 1955-05-06 | 8858dbc699716a100343e8402c3d2d17 | 1327789975 | 39.440657 | -85.829947 | 0 |
| 226502 | 2019-04-24 15:36:30 | 4989847570577635369 | fraud_Ullrich Ltd | kids_pets | 36.53 | Vanessa | Anderson | F | 21178 Brittney Locks | Prosperity | SC | 29127 | 34.1832 | -81.5324 | 8333 | Archaeologist | 1994-07-09 | b73f49b5c7081b864f1ba77678d86fce | 1335281790 | 34.267058 | -80.615114 | 0 |
| 610017 | 2019-09-16 03:13:48 | 4939976756738216 | fraud_Rodriguez, Yost and Jenkins | misc_net | 259.30 | Michelle | Johnston | F | 3531 Hamilton Highway | Roma | TX | 78584 | 26.4215 | -99.0025 | 18128 | IT trainer | 1990-11-07 | f0dccb373dbb52f87993b587cdac5338 | 1347765228 | 27.403295 | -99.484687 | 0 |
| 1161197 | 2020-04-28 23:06:17 | 4334230547694630 | fraud_Padberg-Sauer | home | 25.58 | Scott | Martin | M | 7483 Navarro Flats | Freedom | WY | 83120 | 43.0172 | -111.0292 | 471 | Education officer, museum | 1967-08-02 | 4f7096fc7703ec9af052a743a2f6e2f4 | 1367190377 | 43.862172 | -110.363955 | 0 |
| 174936 | 2019-04-01 14:28:52 | 30074693890476 | fraud_Stiedemann Ltd | food_dining | 64.68 | Kelsey | Richards | F | 889 Sarah Station Suite 624 | Holcomb | KS | 67851 | 37.9931 | -100.9893 | 2691 | Arboriculturist | 1993-08-16 | 570a1cba31bb2b0f2421740a8b599f6b | 1333290532 | 38.967083 | -100.174342 | 0 |
| 1054838 | 2020-03-13 23:32:59 | 630423337322 | fraud_Jacobi and Sons | shopping_pos | 7.20 | Stephanie | Gill | F | 43039 Riley Greens Suite 393 | Orient | WA | 99160 | 48.8878 | -118.2105 | 149 | Special educational needs teacher | 1978-06-21 | 09e9b830ca58e5c71a23748a3d05f869 | 1363217579 | 48.743502 | -118.917042 | 0 |
| 255480 | 2019-05-06 23:45:27 | 30143535920989 | fraud_Kuhn Group | food_dining | 12.49 | Lisa | Collins | F | 44197 Jeffrey Port Suite 050 | Bridgeport | NJ | 8014 | 39.8016 | -75.3478 | 504 | Engineer, control and instrumentation | 1980-08-17 | 12cfa522315fd07b4fcf44dd752203f4 | 1336347927 | 39.041952 | -76.224140 | 0 |
| | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | state | zip | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 558028 | 2019-08-25 23:40:51 | 4128027264554082 | fraud_Schmitt Ltd | misc_net | 851.68 | Kyle | Park | M | 7507 Larry Passage Suite 859 | Mount Perry | OH | 43760 | 39.8788 | -82.1880 | 1831 | Barrister's clerk | 1953-10-18 | 1737ee3f0ef76a932163fb04546137a2 | 1345938051 | 40.151813 | -81.664476 | 1 |
| 1010261 | 2020-02-20 05:52:45 | 2233882705243596 | fraud_Bauch-Raynor | grocery_pos | 344.93 | Jamie | Robinson | F | 67089 Caitlin Meadow Apt. 905 | Sturgis | MS | 39769 | 33.3570 | -89.0473 | 1923 | Medical physicist | 1960-01-16 | dac481866f20ffe19b1725cd2275c7be | 1361339565 | 33.339171 | -88.613480 | 1 |
| 928889 | 2020-01-03 23:52:39 | 4756039869079882102 | fraud_Block-Parisian | misc_net | 757.90 | Francisco | Hernandez | M | 980 Smith Gardens | Gainesville | TX | 76240 | 33.6547 | -97.1583 | 26120 | Engineer, manufacturing | 1954-01-06 | 043e396418a61a663559a4f65e99de74 | 1357257159 | 33.937975 | -96.382509 | 1 |
| 1101557 | 2020-04-03 10:59:44 | 345832460465610 | fraud_Robel, Cummerata and Prosacco | gas_transport | 9.11 | Jason | Mcmahon | M | 6385 Donald Square Suite 429 | Springfield | VA | 22151 | 38.8029 | -77.2116 | 104396 | Production engineer | 1950-11-20 | b991672c504a79471a46d05f02ca4109 | 1364986784 | 38.454841 | -78.157858 | 1 |
| 570534 | 2019-08-30 23:32:57 | 30596478689301 | fraud_Kassulke PLC | shopping_net | 967.49 | Daniel | Graham | M | 28223 Ward Summit Apt. 664 | Arvada | CO | 80005 | 39.8422 | -105.1097 | 122111 | Hotel manager | 1987-05-23 | 9458505d34c58daa2893a825ce9aec3c | 1346369577 | 39.262510 | -105.656175 | 1 |
| 567879 | 2019-08-29 22:07:20 | 3526826139003047 | fraud_Block-Parisian | misc_net | 853.27 | Nathan | Massey | M | 5783 Evan Roads Apt. 465 | Falmouth | MI | 49632 | 44.2529 | -85.0170 | 1126 | Furniture designer | 1955-07-06 | 91b28f3102c9055893488d23afde134e | 1346278040 | 43.357421 | -85.089403 | 1 |
| 1000983 | 2020-02-15 02:30:12 | 30235438713303 | fraud_Durgan-Auer | misc_net | 750.98 | James | Baldwin | M | 3603 Mitchell Court | Winfield | WV | 25213 | 38.5072 | -81.8900 | 5512 | Exhibition designer | 1980-03-24 | e21fba3f053af7ea0bd31dfed1ddf2e4 | 1360895412 | 37.557927 | -81.170028 | 1 |
| 588497 | 2019-09-07 04:27:01 | 377264520876399 | fraud_Schoen, Kuphal and Nitzsche | grocery_pos | 309.71 | Kara | Miles | F | 2076 Thomas Roads Suite 970 | Cassatt | SC | 29032 | 34.3424 | -80.5000 | 4424 | Lawyer | 1961-07-31 | 0cdec150cbb24a0e689a11001e9f0b15 | 1346992021 | 34.178315 | -79.758965 | 1 |
| 1233466 | 2020-05-30 02:45:14 | 6011681934117244 | fraud_Koepp-Parker | grocery_pos | 328.28 | Kaitlyn | Newman | F | 098 Stewart Hill | Slayden | TN | 37165 | 36.2835 | -87.4581 | 70 | Prison officer | 1956-06-22 | c8a8ae7f2d176c8235b20babb0ee8b80 | 1369881914 | 36.153313 | -86.646337 | 1 |
| 1046846 | 2020-03-10 01:28:30 | 3589289942931264 | fraud_Marks Inc | gas_transport | 7.61 | Paula | Estrada | F | 350 Stacy Glens | Spencer | SD | 57374 | 43.7557 | -97.5936 | 343 | Development worker, international aid | 1972-03-05 | 2cbedfecdb3594a19c965e57b1b885c0 | 1362878910 | 43.356988 | -96.770897 | 1 |